Exploratory Data Analysis for Video Game Development Companies
Ryan Reiser
This analysis examines the stock prices of four major video game companies from around the world between January 1, 2010 and August 31, 2024: Tencent (China), Sony (Japan), Electronic Arts (USA), and Ubisoft (France). I chose these companies because the gaming industry often shows trends that differ from those of other industries, and these are some of its largest publicly traded companies. I also deliberately picked companies from different countries and cultures to see whether geography affects how their stocks behave.
In [223]:
import yfinance as yf
import pandas as pd
# Define the ticker symbols for the companies
tickers = ['TCEHY', 'SONY', 'EA', 'UBSFY']  # Tencent (China), Sony (Japan), Electronic Arts (USA), Ubisoft (France)
# Set the start and end dates
start_date = '2010-01-01'
end_date = '2024-08-31'
# Initialize an empty DataFrame for storing stock data
stock_prices = pd.DataFrame()
# Fetch and merge the adjusted close prices for each ticker
for ticker in tickers:
    # Fetch the daily data; auto_adjust=False keeps the 'Adj Close' column,
    # which is omitted when recent yfinance releases default to auto_adjust=True
    data = yf.download(ticker, start=start_date, end=end_date, auto_adjust=False, progress=False)
    # Resample to month end ('ME', pandas 2.2 or later) and keep the last adjusted close of each month
    monthly_data = data['Adj Close'].resample('ME').last()
    # Convert to a DataFrame and rename the column to the ticker name
    monthly_data = pd.DataFrame(monthly_data)
    monthly_data.columns = [ticker]
    # Merge with the main DataFrame
    if stock_prices.empty:
        stock_prices = monthly_data
    else:
        stock_prices = stock_prices.join(monthly_data, how='outer')
# Display the first few rows of the DataFrame
print(stock_prices.head())
               TCEHY       SONY         EA     UBSFY
Date
2010-01-31  3.394330  29.390402  15.923065  2.733077
2010-02-28  3.522520  30.168728  16.216486  2.554185
2010-03-31  3.627239  34.015110  18.250889  2.633693
2010-04-30  3.735569  30.375711  18.945318  2.574062
2010-05-31  3.484334  27.322165  16.148027  1.848554
The code downloads daily price data for each ticker from Yahoo Finance through the yfinance library, resamples it to keep the last adjusted close price of each month, and joins the four series into a single pandas DataFrame for easy analysis.
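To make the month-end step concrete, here is a minimal sketch on made-up daily data (the dates and values are hypothetical, not Yahoo Finance prices). It groups by calendar month with `to_period('M')` and keeps the last observation. For months that contain data, this selects the same values as `resample('ME').last()`, but it labels each row with the month rather than the month-end date.

```python
import pandas as pd

# Hypothetical daily series: one value per calendar day, January to March 2024
days = pd.date_range("2024-01-01", "2024-03-31", freq="D")
daily = pd.Series(range(len(days)), index=days, dtype="float64")

# Keep the last observation of each calendar month
monthly = daily.groupby(daily.index.to_period("M")).last()
print(monthly)
```

Note that the month-end labels produced by `resample` in the notebook (for example 2024-06-30, a Sunday) mark the end of the bin, while the value comes from the last trading day inside it.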
In [212]:
# Renames the columns from ticker symbols to company names
stock_prices.columns = ['Tecent (China)', 'Sony (Japan)', 'Electronic Arts (USA)', 'Ubisoft (France)']
In [213]:
stock_prices.tail()
Out[213]:
| Date | Tecent (China) | Sony (Japan) | Electronic Arts (USA) | Ubisoft (France) |
|---|---|---|---|---|
| 2024-04-30 | 43.382523 | 82.570000 | 126.476265 | 4.68 |
| 2024-05-31 | 46.529999 | 82.339996 | 132.710800 | 4.82 |
| 2024-06-30 | 47.360001 | 84.949997 | 139.152573 | 4.32 |
| 2024-07-31 | 46.049999 | 88.589996 | 150.747803 | 4.05 |
| 2024-08-31 | 48.509998 | 97.559998 | 151.820007 | 3.76 |
Displays the most recent monthly stock prices for each company.
In [116]:
stock_prices.describe()
Out[116]:
| | Tecent (China) | Sony (Japan) | Electronic Arts (USA) | Ubisoft (France) |
|---|---|---|---|---|
| count | 176.000000 | 176.000000 | 176.000000 | 176.000000 |
| mean | 29.105236 | 47.848018 | 78.611792 | 7.869910 |
| std | 20.385822 | 30.719462 | 46.770197 | 6.015187 |
| min | 3.002863 | 9.015063 | 10.778391 | 1.063415 |
| 25% | 9.330123 | 22.630968 | 24.711078 | 2.732308 |
| 50% | 27.658521 | 34.536722 | 86.080414 | 5.630000 |
| 75% | 43.712784 | 77.133202 | 123.383465 | 13.437500 |
| max | 82.871140 | 124.358925 | 151.820007 | 22.049999 |
Displays common descriptive statistics, such as the mean, minimum, and maximum monthly stock price, for each company over the timeline.
In [176]:
import matplotlib.pyplot as plt
for column in stock_prices.columns:
    plt.figure(figsize=(10, 6))  # Set the figure size for better readability
    plt.plot(stock_prices.index, stock_prices[column], label=column)
    plt.title(f'{column} Stock Price (2010-2024)')
    plt.xlabel('Year')
    plt.ylabel('Adjusted Close Price')
    plt.legend()
    plt.grid(True)
    plt.show()
Displays a separate plot of the adjusted close price for each company's stock over the 2010-2024 timeline.
In [118]:
# Set the figure size for better readability
plt.figure(figsize=(14, 8))
# Plot each column in the DataFrame
for column in stock_prices.columns:
    plt.plot(stock_prices.index, stock_prices[column], label=column)
# Set the title and labels
plt.title('Stock Prices of Tencent, Sony, Electronic Arts, and Ubisoft (2010-2024)')
plt.xlabel('Year')
plt.ylabel('Adjusted Close Price')
# Add legend to identify each line
plt.legend()
# Enable grid for better readability
plt.grid(True)
# Show the plot
plt.show()
Combines all of the previously shown plots into one graph, giving each company its own color and name in the legend. As these plots show, the companies broadly follow the same general trends, such as strong growth from 2013 to 2021 and a dip around 2019, possibly because mobile gaming was beginning to saturate the market. Some similarity is to be expected among stocks in the same industry. Unlike many other industries, the video game industry grew during the 2020 pandemic, which kept people indoors for longer. In absolute dollar terms, Electronic Arts shows the largest price swings and the largest price increase over the timeline, while Ubisoft's price moves the least and remains relatively stagnant. Because the four shares trade at very different price levels, these dollar comparisons are not a scale-free measure of volatility or growth; the log-return statistics later in this notebook are better suited to that comparison.
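The four shares trade at very different price levels, so a dollar change and a percentage change can rank them differently. As a rough sketch, the snippet below uses only the first and last monthly prices shown in the outputs above (keyed by ticker) and compares the dollar change with the growth multiple. It ignores everything that happens between those two dates.

```python
import pandas as pd

# First (2010-01-31) and last (2024-08-31) monthly adjusted closes, copied from the outputs above
first = pd.Series({"TCEHY": 3.394330, "SONY": 29.390402, "EA": 15.923065, "UBSFY": 2.733077})
last = pd.Series({"TCEHY": 48.509998, "SONY": 97.559998, "EA": 151.820007, "UBSFY": 3.76})

dollar_change = last - first      # growth measured in dollars per share
growth_multiple = last / first    # growth measured relative to the starting price

summary = pd.DataFrame({"dollar_change": dollar_change, "growth_multiple": growth_multiple})
print(summary.round(2))
```

With the full DataFrame, `stock_prices / stock_prices.iloc[0]` would rebase every series to 1.0 at the start, which puts all four companies on one comparable axis.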
In [122]:
stock_prices.corr()
Out[122]:
| | Tecent (China) | Sony (Japan) | Electronic Arts (USA) | Ubisoft (France) |
|---|---|---|---|---|
| Tecent (China) | 1.000000 | 0.850136 | 0.920228 | 0.796266 |
| Sony (Japan) | 0.850136 | 1.000000 | 0.867818 | 0.482502 |
| Electronic Arts (USA) | 0.920228 | 0.867818 | 1.000000 | 0.688926 |
| Ubisoft (France) | 0.796266 | 0.482502 | 0.688926 | 1.000000 |
Displays the pairwise correlation between the monthly stock prices of each company.
In [126]:
import seaborn as sns
import matplotlib.pyplot as plt
# Calculate the correlation matrix
corr = stock_prices.corr()
# Set up the matplotlib figure
plt.figure(figsize=(10, 8))
# Draw the annotated heatmap of the correlation matrix
sns.heatmap(corr, annot=True, fmt=".2f", cmap='coolwarm',
            xticklabels=corr.columns, yticklabels=corr.columns,
            linewidths=.5, cbar_kws={"shrink": .5})
# Set titles and labels
plt.title('Correlation Heatmap of Gaming Development Companies (2010-2024)')
Out[126]:
Text(0.5, 1.0, 'Correlation Heatmap of Gaming Development Companies (2010-2024)')
Visually displays the correlations between the companies' stock prices as a heatmap. This supports the earlier claim that the companies follow the same general price trends: most pairs are strongly correlated (0.69 to 0.92), with Sony and Ubisoft (0.48) the one clear exception. The heatmap also makes it easier to see how closely correlated Electronic Arts and Tencent are (0.92), possibly because of their similar company sizes, something that is less clear in the earlier plot.
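A heatmap is easy to scan, but a ranked list of pairs can make the strongest and weakest relationships explicit. The sketch below re-enters the correlation values from the table above (keyed by ticker for brevity) and sorts the six unique pairs. The variable names are illustrative.

```python
from itertools import combinations

import pandas as pd

labels = ["TCEHY", "SONY", "EA", "UBSFY"]
# Price-level correlations copied from the stock_prices.corr() output above
corr = pd.DataFrame(
    [[1.000000, 0.850136, 0.920228, 0.796266],
     [0.850136, 1.000000, 0.867818, 0.482502],
     [0.920228, 0.867818, 1.000000, 0.688926],
     [0.796266, 0.482502, 0.688926, 1.000000]],
    index=labels, columns=labels,
)

# One entry per unordered pair, sorted from most to least correlated
pairs = pd.Series(
    {(a, b): corr.loc[a, b] for a, b in combinations(labels, 2)}
).sort_values(ascending=False)
print(pairs)
```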
In [130]:
import numpy as np
# Calculate the logarithmic returns of the stock prices
stock_returns = np.log(stock_prices / stock_prices.shift(1))
# Display the first few rows of the stock_returns DataFrame
print(stock_returns.head())
            Tecent (China)  Sony (Japan)  Electronic Arts (USA)  Ubisoft (France)
Date
2010-01-31             NaN           NaN                    NaN               NaN
2010-02-28        0.037070      0.026138               0.018260         -0.067695
2010-03-31        0.029295      0.119999               0.118185          0.030654
2010-04-30        0.029428     -0.113162               0.037343         -0.022902
2010-05-31       -0.069623     -0.105945              -0.159759         -0.331081
In [132]:
stock_returns.dropna(inplace=True)
In [134]:
stock_returns
Out[134]:
| Date | Tecent (China) | Sony (Japan) | Electronic Arts (USA) | Ubisoft (France) |
|---|---|---|---|---|
| 2010-02-28 | 0.037070 | 0.026138 | 0.018260 | -0.067695 |
| 2010-03-31 | 0.029295 | 0.119999 | 0.118185 | 0.030654 |
| 2010-04-30 | 0.029428 | -0.113162 | 0.037343 | -0.022902 |
| 2010-05-31 | -0.069623 | -0.105945 | -0.159759 | -0.331081 |
| 2010-06-30 | -0.148711 | -0.142951 | -0.136738 | -0.163152 |
| ... | ... | ... | ... | ... |
| 2024-04-30 | 0.114161 | -0.037673 | -0.045096 | 0.115382 |
| 2024-05-31 | 0.070041 | -0.002789 | 0.048118 | 0.029476 |
| 2024-06-30 | 0.017681 | 0.031206 | 0.047399 | -0.109519 |
| 2024-07-31 | -0.028050 | 0.041956 | 0.080037 | -0.064539 |
| 2024-08-31 | 0.052042 | 0.096449 | 0.007087 | -0.074298 |
175 rows × 4 columns
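Calculates the monthly logarithmic return of each stock, that is, the natural log of each month's price divided by the previous month's price. The first month has no previous price, so its row is NaN and is dropped. The table shows the first and last five months of the resulting returns.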
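As a small worked check, the sketch below recomputes the first few Tencent log returns from the prices printed at the top of the notebook and illustrates two properties that make log returns convenient: they add up across months, and `exp(r) - 1` converts one back to a simple percentage return. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Tencent (TCEHY) monthly adjusted closes for Jan-May 2010, copied from the first output above
prices = pd.Series(
    [3.394330, 3.522520, 3.627239, 3.735569, 3.484334],
    index=pd.to_datetime(["2010-01-31", "2010-02-28", "2010-03-31", "2010-04-30", "2010-05-31"]),
)

# Same formula as the notebook: log of this month's price over last month's price
log_returns = np.log(prices / prices.shift(1)).dropna()

# Log returns are additive: their sum equals the log of the total price ratio
total_from_sum = log_returns.sum()
total_from_ends = np.log(prices.iloc[-1] / prices.iloc[0])

# Convert log returns back to simple (percentage) returns
simple_returns = np.exp(log_returns) - 1

print(log_returns.round(6))
print(round(total_from_sum, 6), round(total_from_ends, 6))
```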
In [136]:
import plotly.express as px
# Iterate through each column in the stock_returns DataFrame
for column in stock_returns.columns:
    fig = px.line(stock_returns, y=column, title=f'{column} Log Returns (2010-2024)')
    fig.update_layout(xaxis_title='Date', yaxis_title='Log Returns', showlegend=False)
    fig.show()
Visually displays the monthly logarithmic returns of each company over the timeline.
In [144]:
import plotly.express as px
import pandas as pd
# Melt the stock_returns DataFrame to long format (one row per date and company)
long_format = stock_returns.reset_index().melt(id_vars='Date', var_name='Company', value_name='Log Return')
# Create the line chart using Plotly Express
fig = px.line(long_format, x='Date', y='Log Return', color='Company',
              title='Log Returns of Gaming Development Company Stocks (2010-2024)')
# Update layout for better readability
fig.update_layout(xaxis_title='Date',
                  yaxis_title='Log Return',
                  legend_title='Company')
# Show the plot
fig.show()
Combines all previously displayed logarithmic return graphs into one and gives each company a color and name in the legend.
In [146]:
import plotly.express as px
import pandas as pd
# long_format was created in the previous cell
# Create the line chart using Plotly Express
fig = px.line(long_format, x='Date', y='Log Return', color='Company',
              title='Log Returns of Gaming Development Company Stocks (2010-2024)')
# Update layout for better readability and add a range slider on the date axis
fig.update_layout(xaxis_title='Date',
                  yaxis_title='Log Return',
                  legend_title='Company',
                  xaxis=dict(rangeslider=dict(visible=True), type="date"),
                  )
# Show the plot
fig.show()
Displays the combined log-return graph again and adds a range slider so the view can be restricted to a desired time period. This shows directly how the companies' month-to-month returns resemble and differ from each other. The spikes and dips often do not line up across companies, possibly because a company's stock tends to jump or fall right after a major game release depending on how that game sells, which would leave the monthly log returns of video game companies only loosely related.
In [148]:
stock_returns.describe()
Out[148]:
| | Tecent (China) | Sony (Japan) | Electronic Arts (USA) | Ubisoft (France) |
|---|---|---|---|---|
| count | 175.000000 | 175.000000 | 175.000000 | 175.000000 |
| mean | 0.015198 | 0.006856 | 0.012885 | 0.001823 |
| std | 0.093821 | 0.094965 | 0.085419 | 0.122858 |
| min | -0.252244 | -0.247881 | -0.280927 | -0.528020 |
| 25% | -0.044487 | -0.056769 | -0.045017 | -0.069111 |
| 50% | 0.016921 | 0.008685 | 0.009833 | 0.003275 |
| 75% | 0.078133 | 0.072758 | 0.062357 | 0.079992 |
| max | 0.363236 | 0.288128 | 0.266592 | 0.385364 |
Displays common descriptive statistics, such as the mean, minimum, and maximum, for the monthly log returns of each company over the timeline.
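Monthly figures are often easier to interpret on an annual scale. The sketch below takes the mean and standard deviation rows from the table above (keyed by ticker) and annualizes them with a common convention: multiply the mean log return by 12 and the standard deviation by the square root of 12. The square-root rule assumes monthly returns are independent from month to month, which is an assumption made here for illustration, not something the notebook tests.

```python
import numpy as np
import pandas as pd

# Monthly log-return mean and std, copied from the stock_returns.describe() output above
monthly = pd.DataFrame(
    {"mean": [0.015198, 0.006856, 0.012885, 0.001823],
     "std": [0.093821, 0.094965, 0.085419, 0.122858]},
    index=["TCEHY", "SONY", "EA", "UBSFY"],
)

annual = pd.DataFrame({
    "annual_log_return": monthly["mean"] * 12,          # log returns add across months
    "annual_volatility": monthly["std"] * np.sqrt(12),  # assumes independent monthly returns
})
print(annual.round(3))
```

Because these measures do not depend on the share price level, their ranking can differ from the impression given by the price plots.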
In [150]:
import matplotlib.pyplot as plt
# Assuming stock_returns is already defined and contains the log returns for each stock
# Set the number of bins for the histogram
bins = 50
# Iterate through each column in the stock_returns DataFrame
for column in stock_returns.columns:
    plt.figure(figsize=(10, 6))  # Create a new figure for each histogram
    plt.hist(stock_returns[column].dropna(), bins=bins, alpha=0.7, label=column)
    plt.title(f'Histogram of {column} Log Returns')
    plt.xlabel('Log Returns')
    plt.ylabel('Frequency')
    plt.legend()
    plt.grid(True)
    plt.show()
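Displays a histogram of the monthly log returns for each company. The distributions are centered slightly above zero, consistent with the positive mean log returns in the table above, which is what indicates overall price growth over time. The left skew of the histograms, especially noticeable for Electronic Arts and Ubisoft, reflects a longer tail of occasional large negative months, such as Ubisoft's minimum of -0.53.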
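Skewness and average growth are separate properties of a return distribution, and skewness can be measured rather than read off a histogram. The sketch below uses a small made-up sample (hypothetical values, not the notebook's data) whose mean is positive while its skewness is negative, using pandas' built-in `skew()`.

```python
import pandas as pd

# Hypothetical monthly log returns: mostly small gains plus one large loss
sample = pd.Series([-0.30, 0.02, 0.05, 0.05, 0.08, 0.10, 0.10, 0.12])

mean_return = sample.mean()  # positive: the sample grows on average
skewness = sample.skew()     # negative: the long tail is on the loss side

print(f"mean = {mean_return:.4f}, skewness = {skewness:.2f}")
```

On the notebook's data, `stock_returns.skew()` would give one skewness value per company.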
In [152]:
stock_returns.corr()
Out[152]:
| | Tecent (China) | Sony (Japan) | Electronic Arts (USA) | Ubisoft (France) |
|---|---|---|---|---|
| Tecent (China) | 1.000000 | 0.338676 | 0.281475 | 0.394269 |
| Sony (Japan) | 0.338676 | 1.000000 | 0.305214 | 0.288814 |
| Electronic Arts (USA) | 0.281475 | 0.305214 | 1.000000 | 0.365324 |
| Ubisoft (France) | 0.394269 | 0.288814 | 0.365324 | 1.000000 |
Calculates and displays the correlations between the companies' monthly log returns.
In [154]:
import seaborn as sns
import matplotlib.pyplot as plt
# Calculate the correlation matrix for the stock_returns DataFrame
corr_matrix = stock_returns.corr()
# Set up the matplotlib figure
plt.figure(figsize=(10, 8))
# Draw the annotated heatmap with square cells
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', fmt='.2f',
            xticklabels=corr_matrix.columns, yticklabels=corr_matrix.columns,
            cbar_kws={"shrink": .75}, square=True)
# Add a title
plt.title('Correlation Heatmap of Stock Returns')
# Show the plot
plt.show()
Visually displays the correlations between the companies' log returns as a heatmap. The log returns are only weakly correlated (0.28 to 0.39), far less than the price levels. As mentioned earlier, a likely reason is that a company's stock tends to spike or drop right after a major game release depending on how that game sells, and those release schedules differ from company to company.
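Another factor that may contribute to the gap between price-level and return correlations is a shared long-run trend: two series that both drift upward have highly correlated levels even when their month-to-month moves are unrelated. The sketch below is a purely synthetic illustration of that effect. The drift, shock size, and up/down patterns are made up and are not estimated from the notebook's data.

```python
import numpy as np
import pandas as pd

n = 176       # number of months; a multiple of 4 so the patterns below balance out
drift = 0.02  # hypothetical shared average monthly log return
shock = 0.05  # hypothetical size of each stock's own monthly move

# Two zero-mean, mutually orthogonal up/down patterns standing in for company-specific news
pattern_a = np.tile([1, -1, 1, -1], n // 4)
pattern_b = np.tile([1, 1, -1, -1], n // 4)

returns = pd.DataFrame({
    "A": drift + shock * pattern_a,
    "B": drift + shock * pattern_b,
})
prices = np.exp(returns.cumsum())  # price levels implied by the log returns

level_corr = prices["A"].corr(prices["B"])
return_corr = returns["A"].corr(returns["B"])
print(f"level correlation:  {level_corr:.3f}")
print(f"return correlation: {return_corr:.3f}")
```

In this construction the returns are uncorrelated by design, yet the price levels are almost perfectly correlated because both follow the same upward drift. This is one reason return correlations are usually the more informative measure of how closely stocks move together.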
Overall, these analyses of the four companies' stock prices provide a good overview of the video game industry as a whole. The plots of stock prices over time show general market trends and highly correlated price levels, such as the dip around 2019 and the growth during the first year of the COVID-19 pandemic. They also show which large gaming companies' share prices have been relatively stagnant, such as Ubisoft, and which have grown rapidly with large price swings, such as Electronic Arts. The correlations of monthly log returns are much weaker, which is consistent with the industry's tendency for a company's stock to move sharply right after a game release and then not again until the next one. Surprisingly, there did not appear to be any major trends attributable to geography, though that could be examined further with a larger pool of gaming companies.